Game arguments in computability theory and 
algorithmic information theory 

Andrej Muchnik* Alexander Shen^ Mikhail Vyugin 

January 28, 2013 



Abstract 

We provide some examples showing how game-theoretic arguments (the approach that 
goes back to Lachlan and was developed by An. Muchnik) can be used in computability 
theory and algorithmic information theory. To illustrate this technique, we start with a 
proof of a classical result, the unique numbering theorem of Friedberg, translated to the 
game language. Then we provide game-theoretic proofs for three other results: (1) the
gap between conditional complexity and total conditional complexity; (2) the Epstein-Levin
theorem relating a priori and prefix complexity for a stochastic set (for which we provide a
new game-theoretic proof); and (3) a result about information distances in algorithmic
information theory (obtained by two of the authors [A.M. and M.V.] several years ago but
not yet published). An extended abstract of this paper appeared in [14].



It often happens that some result in computability theory or algorithmic information theory
is essentially about the existence of a winning strategy in some game. This approach
was considered by A. Lachlan for enumerable sets;¹ later it was (in different forms) used by
An. A. Muchnik [9, 10, 11]. In Section 1 we illustrate this approach by showing how a classical
result of recursion theory (Friedberg's theorem on unique numberings) can be translated
into this language. In Section 2 we use the game approach to relate total conditional complexity
$CT(x|y)$ (the minimal complexity of a total program that maps a condition $y$ to some object
$x$) and standard conditional complexity (where the program is not necessarily total). Then in
Section 3 we provide a new game-theoretic proof of a recent result of Epstein and Levin [4].
Finally, in Section 4 we generalize the result of [16] and show that for all natural numbers
$m, n$ and for every string $x_0$ of sufficiently high complexity one can find strings $x_1, \ldots, x_m$ such
that all the conditional complexities $C(x_i|x_j)$ (for all $i, j$ in $\{0, 1, 2, \ldots, m\}$ such that $i \ne j$; note
that $x_0$ is allowed) are equal to $n + O(1)$, where the constant in $O(1)$ depends only on $m$ (but not
on $n$).

1 As Lachlan writes in [8], "our reason for studying basic games [the kind of games he defined] is that every
theorem of T(M) [elementary theory of enumerable sets] known at the present time can be proved by constructing
an effective winning strategy for a suitable basic game."

1 Friedberg's unique numbering



Our first example is a classical result of R. Friedberg [5]: the existence of unique numberings.

Theorem 1 (Friedberg). There exists a partial computable function $F(\cdot,\cdot)$ of two natural variables
such that:

(1) $F$ is universal, i.e., every computable function $f(\cdot)$ of one variable appears among the
functions $F_n \colon x \mapsto F(n, x)$;

(2) all the functions $F_n$ are different.

Proof. The proof can be decomposed into two parts. First, we describe a game and explain
why the existence of a (computable) winning strategy for one of the players makes the statement
of Friedberg's theorem true. In the second part we construct a winning strategy and thereby
finish the proof.

1.1 Game 

The game is infinite and is played on two boards. Each board is a table with an infinite number
of columns (numbered 0, 1, 2, ... from left to right) and rows (numbered 0, 1, 2, ... starting from
the top). Each player (we call them Alice and Bob, as usual) plays on her or his own board. The
players alternate. At each move a player can fill finitely many cells of her/his choice with any
natural numbers (s)he wishes. Once a cell is filled, it keeps this number forever (it cannot be
erased).

The game is infinite, so in the limit we have two tables A (filled by Alice) and B (filled by
Bob). Some cells in the limit tables may remain empty; others contain natural numbers (one in
each cell). The winner is determined by the following rule: Bob wins if

• for each row in the A-table there exists an identical row in the B-table;

• all the rows in the B-table are different.

Lemma 1. Assume that Bob has a computable winning strategy in this game. Then the statement
of Theorem 1 is true.

Proof. A table represents a partial function of two arguments in a natural way: the number in
the i-th row and j-th column is the value of the function on (i, j); if the cell is not filled, the value is
undefined.

Let Alice fill the A-table with the values of some universal function (so the j-th cell in the i-th row
is the output of the i-th program on input j). Alice does this at her own pace, simulating in parallel
all the programs (and ignoring Bob's moves). Let Bob apply his computable winning strategy
against the described strategy of Alice. Then his table also corresponds to some computable
function B (since the entire process is algorithmic). This function satisfies both requirements
of Theorem 1: since the A-function is universal, every computable function appears in some row
of the A-table and therefore (due to the winning condition) also in some row of the B-table. So B is
universal. On the other hand, all $B_n$ are different since the rows of the B-table (containing $B_n$) are
different. □

Remark 1. If Alice had a computable winning strategy in our game, the statement of Theorem 1
would be false. Indeed, let Bob fill his table with the values of a universal function that
satisfies the requirements of the theorem (ignoring Alice's moves). Then Alice fills her table in a
computable way and wins. This means that some row of Alice's table does not appear in Bob's
table (so his function is not universal) or two rows in Bob's table coincide (so his function does
not satisfy the uniqueness requirement).

So we can try the game approach even not knowing for sure who wins in the game; finding 
out who wins in the game would tell us whether the statement of the theorem is true or false 
(assuming that the winning strategy is computable). 

1.2 Winning strategy 

Lemma 2. Bob has a computable winning strategy in the game described. 

Proving this lemma we may completely forget about computability and just describe the 
winning strategy explicitly (this is the main advantage of the game approach). We do this in 
two steps: first we consider a simplified version of the game and explain how Bob can win in 
this simplified version. Then we explain what he should do in the full version of the game. 

In the simplified version of the game Bob, in addition to filling the B-table, may kill some rows
in it. The rows that were killed are not taken into account when the winner is determined. So
Bob wins if the final (limit) contents of the tables satisfy two requirements: (1) for each row
in the A-table there exists an identical valid (non-killed) row in the B-table, and (2) all the valid rows
in the B-table are different. (According to this definition, after a row is killed its content does not
matter.)

To win the game, Bob hires a countable number of assistants and makes the i-th assistant responsible
for the i-th row of the A-table. The assistants start their work one by one; let us agree that the i-th
assistant starts working at move i, so at every moment only finitely many assistants are active.
An assistant starts her work by reserving some row in the B-table not reserved by other assistants, and
then continues by copying the current contents of the i-th row of the A-table (for which she is responsible)
into this reserved row. Also, at some point the assistant may decide to kill the current row
reserved by her, reserve a new row, and start copying the current content of the i-th row into the new
reserved row. Later in the game she may kill the reserved row again, etc.

The instructions for the assistant determine when to kill the reserved row. They should
guarantee that

• if the i-th row in the final (limit) state of the A-table coincides with some previous row, then
the i-th assistant kills her reserved row infinitely many times (so none of her reserved rows
remains active);

• if this is not the case, i.e., if the i-th row is different from all previous rows in the final A-table,
then the i-th assistant kills her row only finitely many times (and after that faithfully copies
the i-th row of the A-table into that row).

If this is arranged, the valid rows of the B-table correspond to the first occurrences of rows with
given contents in the A-table, so they are all different, and contain all the rows of the A-table.

The instruction for the i-th assistant: keep track of the number of rows that you have already
killed in some counter k; if in the current state of the A-table the first k positions in the i-th row are
identical to the first k positions of some previous row, kill the current reserved row in the B-table
(and increase the counter); if not, continue copying the i-th row into the current row.
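
The instruction for an assistant can be sketched in code. The following Python fragment is only an illustration under simplifying assumptions: tables are finite dictionaries of the form {row: {column: value}}, all assistants start at once (rather than the i-th one at move i), and the names `prefix`, `should_kill` and `Assistant` are hypothetical, not taken from the paper.

```python
from itertools import count

EMPTY = None  # stands for a cell that has not been filled yet


def prefix(table, row, k):
    """First k cells of a row of a table stored as {row: {column: value}}."""
    cells = table.get(row, {})
    return tuple(cells.get(col, EMPTY) for col in range(k))


def should_kill(a_table, i, k):
    """True if the first k cells of row i coincide with those of an earlier row."""
    return any(prefix(a_table, i, k) == prefix(a_table, j, k) for j in range(i))


class Assistant:
    def __init__(self, i, fresh_rows):
        self.i = i                    # row of the A-table she is responsible for
        self.k = 0                    # counter of rows killed so far
        self.fresh_rows = fresh_rows  # shared iterator over unreserved B-rows
        self.row = next(fresh_rows)   # currently reserved row of the B-table

    def step(self, a_table, b_table, killed):
        if should_kill(a_table, self.i, self.k):
            killed.add(self.row)               # kill the reserved row,
            self.k += 1                        # increase the counter,
            self.row = next(self.fresh_rows)   # and reserve a new row
        # copy the current contents of row i into the reserved row
        b_table[self.row] = dict(a_table.get(self.i, {}))


# Toy run on a fixed A-table: rows 0 and 2 coincide, row 1 differs from row 0.
a_table = {0: {0: 5}, 1: {0: 7}, 2: {0: 5}}
b_table, killed = {}, set()
fresh_rows = count()
assistants = [Assistant(i, fresh_rows) for i in range(3)]
for _ in range(4):
    for assistant in assistants:
        assistant.step(a_table, b_table, killed)
kills = [assistant.k for assistant in assistants]
```

In this toy run the assistant for row 1 kills once (when her counter is 0, the empty prefixes trivially coincide) and then keeps her row, while the assistant for row 2 kills at every step, as expected for a repeated row.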

Let us see why these instructions indeed have the required properties. Imagine that in the
limit state of the A-table the row i is the first row with given content, i.e., is different from all the
previous rows. For each of the previous rows let us select and fix some position (column) where
the rows differ, and consider the moment T when these positions reach their final states. Let
N be the maximum of the selected columns (in all previous rows). After step T the i-th row in
the A-table differs from all previous rows in one of the first N positions, so if the counter of killed
rows exceeds N, no more killings are possible (for this assistant).

On the other hand, assume that the i-th assistant kills her row finitely many times and N is the
maximal value of her counter. After N is reached, the contents of the i-th row in the A-table is always
different from the previous rows in one of the first N positions, and the same is true in the limit
(since this rectangle reaches its limit state at some moment).

So Bob can win in the simplified game, and to finish the proof of Lemma 2 we need to
explain how Bob can refrain from killing and still win the game.

Let us say that a row is odd if it contains a finite odd number of non-empty cells. Bob will
now ignore odd rows of the A-table and at the same time guarantee that all possible odd rows (there
are countably many possibilities) appear in the B-table exactly once. We may assume now without
loss of generality that odd rows never appear in the A-table: if Alice adds some element in a row
making this row odd, this element is ignored by Bob until Alice wants to add another element
in this row, and then the pair is added. This makes the A-table that Bob sees slightly different
from what Alice actually does, but all the rows in the limit A-table that are not odd (i.e., are
infinite or have an even number of filled cells) will get through, and Bob separately takes care
of odd rows.

Now the instructions for assistants change: instead of killing a row, an assistant should fill some
cells in this row making it odd, and ensure that this odd row is new (different from all other odd
rows of the current B-table). After that, this row is treated as if it were killed (no more
changes). This guarantees that all non-odd rows of the A-table appear in the B-table exactly once.

Also Bob hires an additional assistant who ensures that all possible odd rows appear in the
B-table: she looks at all the possibilities one by one; if some odd row has not appeared yet,
she reserves some row and puts the desired content there. (Unlike other assistants, she reserves
more and more rows.) This behavior guarantees that all possible odd rows appear in the B-table
exactly once. (Recall that other assistants also avoid repetitions among odd rows.) Lemma 2
and Theorem 1 are proven. □

Remark 2. Martin Kummer in his note [7] observes that the property "the i-th enumerable set is
different from all preceding ones" is $\mathbf{0}'$-enumerable and therefore the set of minimal indices can
be represented as the range of a limit-computable function. This remark can be used instead of
an explicit construction, though it is less adapted to the game version.

2 Total conditional complexity 

In this section we switch from general computability theory to algorithmic information
theory and compare the conditional complexity $C(x|y)$ and the minimal length of a program
of a total function that maps y to x. The latter quantity may be called "total conditional complexity"
(see, e.g., [1]). It turns out that total conditional complexity $CT(x|y)$ can be much
bigger than $C(x|y)$. But let us first recall the definitions.

The conditional complexity of a binary string x relative to a binary string y (a condition)
is defined as the length of the shortest program that maps y to x. The definition depends on
the choice of the programming language, and one should select an optimal one that makes the
complexity minimal (up to an $O(1)$ additive term). When the condition y is empty, we get the
(unconditional plain) complexity of x. See, e.g., [13] for more details. The conditional complexity of
x relative to y is denoted by $C(x|y)$; the unconditional complexity of x is denoted by $C(x)$.

It is easy to see that $C(x|y)$ can also be defined as the minimal complexity of a program
that maps y to x. (This definition coincides with the previous one up to an $O(1)$ additive term; any
programming language that allows effective translations from other programming languages
can be used.) But in some applications (e.g., in algorithmic statistics, see [15]) we are interested
in total programs, i.e., programs that terminate on every input. Let us define $CT(x|y)$ as the
minimal complexity of a total program that maps y to x. In general, this restriction could
increase complexity, but how significant could this increase be? It turns out that these two
quantities may differ drastically, as the following simple theorem shows (this observation was
made by several people independently; the first publication is probably [1, Section 6.1]).

Theorem 2. For every n there exist two strings $x_n$ and $y_n$ of length n such that $C(x_n|y_n) = O(1)$
but $CT(x_n|y_n) \ge n$.

Proof. To prove this theorem, consider a game $G_n$ (for each n). In this game Alice constructs a
partial function A from $\mathbb{B}^n$ to $\mathbb{B}^n$, i.e., a function defined on (some) n-bit strings, whose values
are also n-bit strings. Bob constructs a list $B_1, B_2, \ldots$ of total functions of type $\mathbb{B}^n \to \mathbb{B}^n$. (Here
$\mathbb{B} = \{0, 1\}$.)

The players alternate; at each move Alice can add several strings to the domain of A and
choose some values for A on these strings; the existing values cannot be changed. Bob can
add some total functions to the list, but the total length of the list should remain less than
$2^n$. The players can also leave their data unchanged; the game, though infinite by definition,
is essentially finite since only a finite number of nontrivial moves is possible. The winner is
determined as follows: Alice wins if in the limit state there exists an n-bit string y such that A(y)
is defined and is different from $B_1(y), B_2(y), \ldots$ for all functions in Bob's list.

Lemma 3. Alice has a computable (uniformly in n) winning strategy in this game.²

Before proving this lemma, let us explain why it proves Theorem 2. Let (for every n) Alice
play against the following strategy of Bob: he just enumerates all the total functions of type
$\mathbb{B}^n \to \mathbb{B}^n$ that have complexity less than n, and adds them to the list when they appear. (As
in the previous section, Bob does not really care about Alice's moves.) The behavior of Alice
is then also computable since she plays a computable strategy against a computable opponent.
Let $y_n$ be the string where Alice wins, and let $x_n$ be equal to $A(y_n)$ where A is the function
constructed by Alice.

It is easy to see that $C(x_n|y_n) = O(1)$; indeed, knowing $y_n$, we know n, can simulate the
game, and find $x_n$ during this simulation. On the other hand, if there were a total function of
complexity less than n that maps $y_n$ to $x_n$, then this function would be in the list and Bob would
win.

2 Since the game is effectively finite, in fact the existence of a winning strategy implies the existence of a
computable one. But it is easy to describe the computable strategy explicitly.

So it remains to prove the lemma by describing the strategy for Alice. This strategy is straightforward:
first Alice selects some y and says that A(y) is equal to some x. (This choice can be
made in an arbitrary way if Bob has not selected any functions yet; we may always assume this is
the case by postponing the first move of Bob; the timing is not important in this game.) Then
Alice waits until one of Bob's functions maps y to x. This may never happen; in this case
Alice does nothing else and wins with x and y. But if this happens, Alice selects another y and
chooses x that is different from $B_1(y), \ldots, B_k(y)$ for all total functions $B_1, \ldots, B_k$ that are currently
in Bob's list. Since there are fewer than $2^n$ total functions in the list, this is always possible.
Also, since Bob can make at most $2^n - 1$ nontrivial moves, Alice will not run out of strings y.
Lemma 3 and Theorem 2 are proven. □
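
Alice's strategy is simple enough to simulate directly. The sketch below is an illustration only, under assumptions that are not in the paper: n-bit strings are encoded as integers 0, ..., 2^n - 1, Bob's total functions are ordinary Python callables supplied one per move, and the name `alice_play` is hypothetical.

```python
def alice_play(n, bob_moves):
    """Play Alice's strategy against the functions Bob adds, one per move.

    Returns (y, x, A): the final winning pair and Alice's partial function.
    """
    size = 2 ** n        # n-bit strings are encoded as integers 0 .. 2**n - 1
    A = {}               # Alice's partial function
    bob_list = []        # Bob's current list of total functions
    y = 0
    A[y] = 0             # the first pair is chosen arbitrarily
    for f in bob_moves:
        bob_list.append(f)
        if len(bob_list) >= size:
            raise ValueError("Bob's list must stay shorter than 2**n")
        if f(y) == A[y]:                       # the current pair got covered
            y += 1                             # take a fresh argument y ...
            taken = {g(y) for g in bob_list}   # ... and avoid all of Bob's values
            A[y] = next(v for v in range(size) if v not in taken)
    return y, A[y], A


# Toy run for n = 2 against three total functions on {0, 1, 2, 3}.
bob_functions = [lambda y: 0, lambda y: 1, lambda y: y]
y_win, x_win, A_final = alice_play(2, bob_functions)
```

Each of Bob's moves forces at most one change of y, so with fewer than 2^n functions in the list the index y never leaves the range of n-bit strings, which mirrors the counting in the proof.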

A well-known result of Bennett, Gács, Li, Vitányi and Zurek [2] says that if $C(x|y)$ and
$C(y|x)$ are small (do not exceed some k), there exists a program of complexity at most
$k + O(\log k)$ that maps x to y and at the same time maps y to x (given an additional advice bit that
says which of these two tasks it should perform). The natural question arises: is a similar statement
true for total conditional complexities and computable bijections? The (partly negative)
answer is provided by the following theorem (a sketch of its proof is given in [10], but some
important details are missing there):

Theorem 3. Let x and y be two binary strings of length at most n. Then there exists a program
t that computes a permutation of the set of all binary strings, maps x to y and

$$C(t) \le CT(x|y) + CT(y|x) + O(\log n).$$

This bound cannot be improved significantly: for every k and n such that $n > 2k$ there exist two
strings x and y of length n such that $C(x), C(y) \le k + O(\log n)$ but any program for a bijection
that maps x to y has complexity at least $2k - O(1)$.

Note the difference with the non-total result mentioned earlier: now instead of the maximum of
$C(x|y)$ and $C(y|x)$ we need their sum.

Proof. The first part is simple. Having two total programs p (mapping x to y) and q (mapping
y to x) and knowing n, we compute a one-to-one correspondence between two sets of strings
of length at most n: string u corresponds to v if p(u) = v and q(v) = u at the same time. (This
correspondence can be effectively computed as a finite object, since both p and q are total
according to our assumption.) Then we extend this correspondence to a permutation of the set
of all strings of length at most n; one more extension gives a computable permutation of the set
of all binary strings (we may assume, for example, that all longer strings are mapped to themselves).
The program t obtained in this way can be effectively constructed given p, q and n, so we get
the required bound. (Note that both $CT(x|y)$ and $CT(y|x)$ do not exceed n, therefore forming a
pair from p and q can be done with $O(\log n)$ overhead.)
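
The construction of the permutation can be made concrete. The following Python sketch is only one possible way to carry out the extension step (the proof allows any extension); the names `strings_up_to` and `make_permutation` are hypothetical, and p and q are assumed to be total Python functions on strings.

```python
from itertools import product


def strings_up_to(n):
    """All binary strings of length at most n, shortest first."""
    return ["".join(bits) for m in range(n + 1) for bits in product("01", repeat=m)]


def make_permutation(p, q, n):
    """Build a permutation t of all binary strings from total functions p and q."""
    universe = strings_up_to(n)
    inside = set(universe)
    # u corresponds to v if p(u) = v and q(v) = u (and v is also short enough)
    match = {u: p(u) for u in universe if p(u) in inside and q(p(u)) == u}
    used = set(match.values())
    # extend the partial correspondence: pair the leftover strings in order
    free_sources = [u for u in universe if u not in match]
    free_targets = [v for v in universe if v not in used]
    match.update(zip(free_sources, free_targets))

    def t(s):
        return match.get(s, s)   # longer strings are mapped to themselves

    return t


# Toy example: p maps "00" to "11" and q maps "11" back to "00".
p = lambda u: "1" * len(u)
q = lambda v: "0" * len(v)
t = make_permutation(p, q, 2)
image = sorted(t(u) for u in strings_up_to(2))
```

The correspondence is injective because q(v) = u determines u from v, so the numbers of leftover sources and leftover targets coincide and the pairing in order is well defined.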

For the second part, we again consider a game. Let X and Y be sets that contain $2^n$ elements
(recall that $n > 2k$). Alice can mark some elements in X or Y, not more than $2^k$ elements in
each set. Bob can list (sequentially) some bijections between X and Y, at most $2^{2k-2}$ bijections.
Winning condition: Bob wins if for every marked element $x \in X$ and for every marked element
$y \in Y$ there exists a bijection in the list that maps x to y.

It is easy to see that Bob can win if $2^{2k-2}$ is replaced by $2^{2k}$: when Alice marks new elements,
he forms a bijection for every new pair of marked elements, and adds all these bijections
to the list; in total there are at most $2^k \cdot 2^k$ such pairs. But $2^{2k-2}$ bijections are not enough:



Lemma 4. Alice has a winning strategy in this game. 

Let us explain why this is enough to prove the theorem. Let $X = Y = \mathbb{B}^n$ (the set of n-bit
strings). Let Alice play against Bob who generates all programs of complexity less than $2k - 2$
and runs them (in parallel) on all elements of X; when he finds that some program computes a
bijection between X and Y, this bijection is added to the list. Since Alice wins, there are some
marked elements x and y that are not connected by any bijection in the list. These elements are
determined by n, k, and their ordinal number in the enumeration; the latter can be encoded by k
bits since there are at most $2^k$ marked elements in each set (so we get $O(\log n) + k$ bits in total).

This argument assumes that Alice's strategy is computable given n and k; as before, we may
note that the existence of some strategy implies the existence of a computable one, or look at the
actual strategy below.

It remains to show a (computable) winning strategy for Alice. She starts by marking arbitrary
elements $x_1 \in X$ and $y_1 \in Y$ and then waits until Bob provides a bijection that connects
them. After that, Alice chooses (again arbitrarily) some element $x_2 \ne x_1$ and waits until $x_2$ is
connected with $y_1$ (Bob needs a new bijection for that, since the old one connects $x_1$ and $y_1$).
Then Alice switches to Y and chooses a new element $y_2$ not connected to $x_1, x_2$ by existing
bijections, and waits until Bob adds two new bijections connecting $y_2$ to $x_1$ and $x_2$. Then she
continues in the same way, alternating between X and Y. At each step she takes an element
not connected by existing bijections to existing elements on the other side. If Alice is able to
continue this process, then for each new pair of marked elements a new bijection is needed, so
the total number of bijections should be at least $2^{2k}$.

Things are not so simple, however: it may happen that all elements of X (or Y) are already
connected to some marked elements,³ so Alice cannot choose $x \in X$ that is not connected to
any marked element of Y by any listed bijection. However, Alice can get at least half of the new
pairs each time. Indeed, assume that she selects an element in X; let us show that she can select
an element that is connected to less than half of the marked elements in Y. Each marked element
in Y is connected to at most $2^{2k-2}$ elements in X (at most one per listed bijection), and X has
$2^n > 2^{2k}$ elements, so the probability that a (uniformly) random
element in X is connected to a random marked element in Y is at most 1/4. Therefore, for some
element in X only 1/4 (or less) of the marked elements in Y are connected to it, and Alice may choose
this element. This argument saves at least half of the pairs, so the total number of bijections
needed to cover all pairs is at least $2^{2k-1}$, more than Bob has. Lemma 4 is proven. □
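
The averaging step in this argument can be written out explicitly. The following display is a sketch in the notation of the proof; the symbols $M$ (the set of currently marked elements of $Y$) and $d(x)$ (the number of elements of $M$ connected to $x$ by the listed bijections) are introduced here only for illustration.

```latex
\begin{align*}
\sum_{x \in X} d(x)
  &= \sum_{y \in M} \#\{x \in X : x \text{ is connected to } y\}
   \le |M| \cdot 2^{2k-2}, \\
\min_{x \in X} d(x)
  &\le \frac{1}{|X|} \sum_{x \in X} d(x)
   \le \frac{|M| \cdot 2^{2k-2}}{2^{n}}
   \le \frac{|M|}{4}
   \qquad (\text{since } n > 2k).
\end{align*}
```

So some $x$ leaves at least $3|M|/4 \ge |M|/2$ of the pairs $(x, y)$ with $y \in M$ uncovered, and each uncovered pair needs its own new bijection, since one bijection maps $x$ to only one element of $Y$.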

3 Epstein-Levin theorem 

In this section we discuss a game-theoretic interpretation of an important recent result of Epstein
and Levin [4]. This result can be considered as an extension of some previous observations
made by Vereshchagin (see [15]). Let us first recall some notions from algorithmic
information theory.

For a finite object x one may consider two quantities. The first one, the complexity of x,
shows how many bits we need to describe x (using an optimal description method). The second
one, the a priori probability of x, measures how probable the appearance of x is in a (universal)
algorithmic random process. The first approach goes back to Kolmogorov while the second
one was suggested earlier by Solomonoff.⁴ The relation between these two notions in its
cleanest form was established by Levin and later by Chaitin (see [3] for more details).

3 There are $2^{2k}$ bijections and $2^k$ marked elements, so at most $2^{3k}$ elements can be connected; we know only
that n is greater than 2k, not 3k.

For that purpose Levin modified the notion of complexity and introduced prefix complexity
$K(x)$ where programs (descriptions) satisfy an additional property: if p is a program that outputs
x, then every extension of p (every string having prefix p) also outputs x. (Chaitin used
another restriction: the set of programs should be prefix-free, i.e., none of the programs is a
prefix of another one; though it is a significantly different restriction, it leads to the same notion
of complexity up to an $O(1)$ additive term.)

The notion of a priori probability can be formally defined in the following way. Consider
a randomized algorithm M without input that outputs some natural number and stops. The
output number depends on the internal random bits (fair coin tosses) used by M. For every x there
is some probability $m_x$ to get x as output. The sum $\sum_x m_x$ does not exceed 1; it can be less if
the machine M performs a non-terminating computation with positive probability. In this way
every machine M corresponds to some function $x \mapsto m_x$. There exists a universal machine M
of this type, i.e., a machine for which the function $x \mapsto m_x$ is maximal up to a constant factor.
For example, M can start by choosing a random machine in such a way that every choice has
positive probability, and then simulate the chosen machine. We now fix some universal machine
M and call the probability $m_x$ to get x on its output the a priori probability of x.

The relation between prefix complexity and a priori probability is quite close: Levin and
Chaitin have shown that $K(x) = -\log_2 m_x + O(1)$. However, the situation changes if we extend
prefix complexity and a priori probability to sets. Let X be a set of natural numbers. Then we
can consider two quantities that measure the difficulty of the task "produce some element of X":

• complexity of X, defined as the minimal length of a program that produces some element
in X;

• a priori probability of X, the probability to get some element of X as an output of the
universal machine M.

As we have mentioned, for singletons the complexity coincides with the minus logarithm of the a
priori probability up to an $O(1)$ additive term. For an arbitrary set of integers this is no longer the
case: complexity can differ significantly from the minus logarithm of a priori probability. In
other words, for an arbitrary set X the quantities

$$\max_{x \in X} m_x \quad\text{and}\quad \sum_{x \in X} m_x$$

(the first one corresponds to the complexity of X, the second one is the a priori probability of X)
could be very different. For example, if X is the set of strings of length n that have complexity
close to n, the first quantity is rather small (since all $m_x$ are close to $2^{-n}$ by construction) while
the second one is quite big (a string chosen randomly with respect to the uniform distribution
on n-bit strings has complexity close to n with high probability).

The Epstein-Levin theorem says that such a big difference is not possible if the set X is stochastic.
The notion of a stochastic object was introduced in algorithmic statistics. A finite object
X (in our case, a finite set of strings) is called stochastic if, informally speaking, X is a "typical"
representative of some "simple" probability distribution. This means that there exists a
probability distribution P with finite domain (containing X) and rational probabilities such that
(1) P has small complexity, and (2) the randomness deficiency of X with respect to P, defined as
$-\log P(X) - K(X|P)$, is small. (Note that here we speak about the complexity of X and P, where
X is a finite set of strings, and P is a distribution on finite sets of strings. These notions are well
defined, since the complexity of a finite object does not depend on the choice of its computable
encoding, up to an $O(1)$ additive term.) Here $K(X|P)$ stands for the conditional prefix complexity of
X given P, see [13] for details.

4 Solomonoff also mentioned complexity as a technical tool somewhere in his paper.

The Epstein-Levin theorem is essentially a result about some type of games (we call them
Epstein-Levin games). To define such a game, fix a finite bipartite graph $E \subset L \times R$ with
left part L and right part R. A probability distribution P on R with rational values is also fixed,
as well as three parameters: some natural number $k$, some natural number $l$ and some positive
rational number $\delta$. After all these objects are fixed, we consider the following game.

Alice assigns some rational weights to vertices in L. Initially all the weights are zeros, but
Alice can increase them during the game. The total weight of L (the sum of weights) should
never exceed 1. Bob can mark some vertices on the left and some vertices on the right. After
a vertex is marked, it remains marked forever. The restrictions for Bob: he can mark at most
$l$ vertices on the left, and the total P-probability of marked vertices on the right should be at
most $\delta$. The winner is determined as follows: Bob wins if every vertex y on the right for which
the (limit) total weight of all its L-neighbors exceeds $2^{-k}$ either is marked itself (at some point),
or has a marked (at some point) neighbor.

Evidently, the task of Bob becomes harder if $l$ or $\delta$ decreases (he has less freedom in marking
vertices), and becomes easier if $k$ decreases (he cares about fewer vertices). So the greater $k$ and
the smaller $\delta$ is, the bigger $l$ is needed by Bob to win. The following lemma gives a bound
(with some absolute constant in the O-notation):

Lemma 5. For $l = O(2^k \log(1/\delta))$ Bob has a computable winning strategy in the described
game.

Before proving this lemma, let us explain the connection between this game and the statement
of the Epstein-Levin theorem. Vertices in R are finite sets of integers; vertices in L are
integers, and the edges correspond to the $\in$-relation. Alice's weights are a priori probabilities of
integers (more precisely, increasing approximations to them). The distribution P on R is a
simple distribution (on a finite family R of finite sets) that is assumed to make X (from the
Epstein-Levin theorem) stochastic. Bob may mark X, but this would make it non-random with respect
to P (marked vertices form a P-small subset and therefore all have big randomness deficiency),
so Epstein and Levin do not need to care about X any more. If X is not marked and has big
total weight (= the total a priori probability), X is guaranteed to have a marked neighbor. This
means that some element of X is marked and therefore has small complexity (since there are
only few marked elements); this is what the Epstein-Levin theorem says. (Of course, one needs to
use some specific bounds instead of "small" and "large" etc.; we provide the exact statements
after the proof of the lemma.)

Proof. To prove the existence of a winning strategy for Bob, we use the following (quite unusual)
type of argument: we exhibit a simple probabilistic strategy for Bob that guarantees
some positive probability of winning against any strategy of Alice. Since the game is essentially
a finite game with full information (see the comments at the end of the proof about how
to make it really finite), either Alice or Bob has a winning strategy. And if Alice had one, no
probabilistic strategy for Bob could have a positive probability of winning.

Let us describe this strategy for Bob. It is rather simple: if Alice increases the weight of some
vertex x in L by an additional $\varepsilon > 0$, Bob responds by tossing a coin and marking x with
probability $c 2^k \varepsilon$, where $c > 1$ is some constant to be chosen later. We need also to specify what
Bob does if $c 2^k \varepsilon > 1$ (this always happens if $\varepsilon$ is $2^{-k}$ or more). In this case Bob marks x for
sure. Note also that without loss of generality we may assume that Alice increases weights one
at a time, since we can split her move into a sequence of moves.

We have explained how Bob marks $L$-vertices; if at some point this does not help for some
$R$-vertex, i.e., this vertex has total weight at least $2^{-k}$ but no marked neighbors, Bob
immediately marks this $R$-vertex (as well as all other vertices with this property).
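The strategy just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `bob_moves`, the dictionary representation of the bipartite graph, and the replay of a fixed list of Alice's moves (rather than an adaptive opponent) are all choices made here, not part of the original argument.

```python
import random


def bob_moves(graph, weight_increases, k, c, rng):
    """Replay Bob's randomized strategy against a fixed list of Alice's moves.

    graph: dict mapping each R-vertex to the set of its L-neighbors (assumed format).
    weight_increases: list of (x, eps) pairs, one weight increase per move.
    Returns (marked_left, marked_right).
    """
    weight = {}
    marked_left, marked_right = set(), set()
    threshold = 2.0 ** (-k)
    for x, eps in weight_increases:
        weight[x] = weight.get(x, 0.0) + eps
        # Mark x with probability c * 2^k * eps; if this exceeds 1, mark for sure.
        if rng.random() < min(1.0, c * 2 ** k * eps):
            marked_left.add(x)
        # Every heavy R-vertex that still has no marked neighbor is marked itself.
        for y, neighbors in graph.items():
            total = sum(weight.get(v, 0.0) for v in neighbors)
            if total >= threshold and not (neighbors & marked_left):
                marked_right.add(y)
    return marked_left, marked_right
```

By construction, every heavy $R$-vertex ends up marked or with a marked neighbor; whether Bob also stays within the limits on marks is what the rest of the proof estimates.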

The probabilistic strategy for Bob is described, and we need to consider some (deterministic)
strategy $a$ for Alice and show that the probability of winning the game for Bob (for suitable
$c$, see below about the choice of $c$) is positive when playing against $a$. By construction,
there are two reasons why Bob could lose the game:

• the total measure of marked $R$-vertices exceeds $\delta$;

• the number of marked $L$-vertices exceeds $l$.

To show that with positive probability neither of these events happens, we ensure that the
probability of each event is less than $1/2$. For that we show that the expected $P$-measure of
marked $R$-vertices is less than $\delta/2$ and the expected number of marked $L$-vertices is
less than $l/2$ (and then apply Markov's inequality).

Let us fix some $y$ and estimate the probability for $y$ to be marked by Bob (= to have no
marked neighbors when the sum of weights of $y$'s neighbors achieves $2^{-k}$). Assume that the
weights of neighbors of $y$ were increased by $\varepsilon_1, \ldots, \varepsilon_u$ during the
game, and now $\sum_i \varepsilon_i \ge 2^{-k}$. After each increase the corresponding neighbor
of $y$ was marked with probability $c 2^k \varepsilon_i$, so the probability that all the
neighbors remain not marked does not exceed

$$(1 - c 2^k \varepsilon_1) \cdot \ldots \cdot (1 - c 2^k \varepsilon_u) \le e^{-c 2^k (\varepsilon_1 + \ldots + \varepsilon_u)} \le e^{-c}$$

(recall that $1 - t \le e^{-t}$ and that $\sum_i \varepsilon_i \ge 2^{-k}$). Therefore for every
measure $P$ the expected $P$-measure of marked vertices on the right (the weighted average of
numbers not exceeding $e^{-c}$) does not exceed $e^{-c}$. So it is enough to let $c$ be
$\ln(1/\delta) + O(1)$.
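The product bound can be checked numerically for a fixed (non-adaptive) sequence of increases. The helper below is a minimal sketch; the name `unmarked_probability` and the capping of each marking probability at 1 are assumptions made for the illustration.

```python
import math


def unmarked_probability(increases, k, c):
    """Probability that no neighbor is marked, for a fixed list of weight increases."""
    p = 1.0
    for eps in increases:
        # Each neighbor escapes marking with probability 1 - min(1, c * 2^k * eps).
        p *= 1.0 - min(1.0, c * 2 ** k * eps)
    return p


# Eight increases of 2^-7 each: total weight 2^-4, so with k = 4 the bound e^-c applies.
p = unmarked_probability([2 ** -7] * 8, k=4, c=3.0)
print(p, math.exp(-3.0))
```

In this instance each factor equals $1 - 3 \cdot 16 / 128 = 0.625$, and the product of eight such factors stays below $e^{-3}$.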

In fact, this picture is oversimplified: the estimate for the probability should be done more
carefully, since the values of $\varepsilon_1, \ldots, \varepsilon_u$ may depend on Bob's moves.
The situation can be described as follows: our opponent (following some probabilistic strategy)
tells us some numbers in $[0, 1]$ (one by one). After the opponent names some $\varepsilon$, we
perform a random coin toss with probability of success $\varepsilon$. Then for every $t$ the
probability of the event "at the moment when the sum of numbers exceeds $t$, we still have no
successful trials" does not exceed $e^{-t}$. (To prove this statement formally, we need a
backward induction in the tree of possibilities.)
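The backward induction can be imitated numerically on a discretized version of this game. The sketch below assumes the opponent may only name multiples of `1/grid` and treats the event as having happened once the sum reaches `t`; both the discretization and the name `worst_case_no_success` are illustrative assumptions, not part of the original proof.

```python
import math


def worst_case_no_success(t, grid):
    """Largest probability, over adaptive opponents naming multiples of 1/grid,
    that no trial has succeeded by the time the sum of named numbers reaches t."""
    steps = math.ceil(t * grid)  # target sum, in units of 1/grid
    # value[s]: best achievable probability when the current sum is s/grid
    # and no trial has succeeded so far; sums >= t mean the event has happened.
    value = [1.0] * (steps + grid + 1)
    for s in range(steps - 1, -1, -1):
        # Naming j/grid: the trial fails with probability 1 - j/grid.
        value[s] = max((1.0 - j / grid) * value[s + j] for j in range(1, grid + 1))
    return value[0]


print(worst_case_no_success(2.0, 50), math.exp(-2.0))
```

In this model the opponent's best option is to name the smallest allowed number every time, and even then the probability stays below $e^{-t}$, approaching it as the grid becomes finer.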

The expected number of marked $L$-vertices can be estimated in the same way. Here the opponent
also gives us some numbers whose sum is guaranteed not to exceed some $t$ ($t = c 2^k$ in our
case), and we use them as probabilities of success for random coin tosses. A similar argument
shows that the expected number of successes does not exceed $t$. We need $t = c 2^k < l/2$, so
we take $l = c 2^{k+2} = 2^{k+2} (\ln(1/\delta) + O(1)) = O(2^k \log(1/\delta))$.
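For concreteness, the two parameters can be computed for given $k$ and $\delta$. This is a sketch with one particular choice of the $O(1)$ term (adding 1 to $\ln(2/\delta)$), which is an assumption made here to keep both inequalities strict; the function name is hypothetical.

```python
import math


def lemma_parameters(k, delta):
    """Return (c, l) with exp(-c) < delta / 2 and c * 2^k < l / 2."""
    c = math.log(2.0 / delta) + 1.0      # c = ln(1/delta) + O(1)
    l = math.ceil(c * 2 ** (k + 2))      # l = c * 2^(k+2), rounded up to an integer
    return c, l


c, l = lemma_parameters(k=5, delta=2 ** -10)
print(c, l)
```

With these values the expected measure of marked $R$-vertices is below $\delta/2$ and the expected number of marked $L$-vertices is below $l/2$, as the proof requires.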

To finish the proof of the lemma, one last remark is needed. To make our argument (a transition
from a probabilistic strategy to a deterministic one) correct, we need to make the game finite.
One may assume that current weights of vertices on the left all have the form $2^{-m}$ for some
integer $m$ (replacing weights by approximations from below, we can compensate for an additional
factor of 2 by changing $k$ by 1). Still the game is not finite, since Alice can start with very
small weights. However, this is not important: the graph is finite, and all very small weights
can be replaced by some $2^{-m}$. If $2^{-m} \cdot \#L < 1$, then the sum of weights still does
not exceed 2, and this again is a constant factor.

□ 

Now we can apply this lemma to prove the Epstein-Levin theorem. Let us first give exact
definitions. A finite object $X$ is called $\alpha$-$\beta$-stochastic if there exists a finite
probability distribution $P$ (with finite support and rational values, so it is a finite object)
such that

• $K(P)$ does not exceed $\alpha$;

• the deficiency $d(X|P)$, defined as $-\log P(X) - K(X|P)$, does not exceed $\beta$.

Theorem 4 (Epstein-Levin). If a finite set $X$ is $\alpha$-$\beta$-stochastic, and its total a
priori probability $\sum_{x \in X} m_x$ exceeds $2^{-k}$, then $X$ contains some element $x$
such that

$$K(x) \le k + K(k) + \log K(k) + \alpha + O(\log \beta) + O(1).$$

The sum $\sum_{x \in X} m_x$ can be called the a priori probability of the problem "produce some
element of $X$", and $\min_{x \in X} K(x)$ can be called the prefix complexity of the same
problem. The Epstein-Levin theorem guarantees that for $\alpha$-$\beta$-stochastic sets $X$ with
small $\alpha$ and $\beta$ the prefix complexity is logarithmically close to the minus logarithm
of the a priori probability.

Proof. We follow the plan outlined above. Let $P$ be the finite probability distribution that
makes $X$ stochastic. This means that $K(P) \le \alpha$ and
$d(X|P) = -\log P(X) - K(X|P) \le \beta$. Consider the Epstein-Levin game where $R$ is the
support of $P$, the left-hand side $L$ is the union of all sets in $R$, and edges connect each
set $U \in R$ to all its elements. To describe the game completely, we need to specify the
parameters $k$, $l$, and $\delta$. The parameter $k$ is taken from the statement of our theorem;
$\delta = 2^{-d}$ where $d$ will be chosen later, and
$l = O(2^k \log(1/\delta)) = O(d 2^k)$ is determined by $k$ and $d$ as described in Lemma 5.
(This guarantees that Bob has a winning strategy in the game.) Then we let Bob play in this game
against Alice, who assigns (in the limit) weight $m_x$ to every element $x \in L$.

We will choose $d$ in such a way that all marked elements in $R$ have deficiency greater than
$\beta$; our assumptions then guarantee that $X$ is not marked. Lemma 5 then guarantees that $X$
has a marked neighbor, i.e., that some element of $X$ is marked. It remains to estimate the
complexity of marked elements in $L$.

Why do marked elements in $R$ have high deficiency? We know that the total measure of marked
elements in $R$ does not exceed $2^{-d}$. Consider the semimeasure $P'$ that equals $2^d P$ on
marked elements and $0$ otherwise; $P'$ can be enumerated if $P$, $d$, and $k$ are given, so

$$K(U|P,d,k) \le -\log P'(U) + O(1)$$

for every $U$ in $R$. If $U$ is not marked, this is trivial (the right hand side is infinite);
for marked $U$ we have

$$K(U|P,d,k) \le -\log P(U) - d + O(1)$$

and therefore

$$K(U|P) \le -\log P(U) - d + K(d) + K(k) + O(1),$$

so

$$d(U|P) \ge d - K(d) - K(k) - O(1)$$

for all marked $U$ in $R$. So we need the inequality

$$d - K(d) - K(k) - O(1) > \beta$$

to ensure that $X$ is not marked. This is guaranteed if

$$d = 2(\beta + K(k)) + O(1)$$

(we do not care about the constant factor in $d$ since only $\log d$ will be used in the
complexity bound below).
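The reason why a value of $d$ of this form suffices can be spelled out as follows. This is a sketch of the routine calculation, with the constants $c_0, c_1, c_2, c_3$ named here only for the purpose of the illustration; it uses the standard bound $K(d) \le 2\log d + O(1)$ for the prefix complexity of an integer.

```latex
% c_0: the O(1) constant in  d(U|P) >= d - K(d) - K(k) - c_0
% c_1: a constant with  K(d) <= 2 \log d + c_1
% c_2: a constant with  2 \log d <= d/2 + c_2  for all d >= 1  (c_2 = 3 works)
\begin{align*}
d - K(d) - K(k) - c_0
  &\ge d - 2\log d - c_1 - K(k) - c_0 \\
  &\ge \tfrac{d}{2} - c_2 - c_1 - K(k) - c_0 .
\end{align*}
% With d = 2(\beta + K(k)) + c_3 we have d/2 = \beta + K(k) + c_3/2, hence
\[
d - K(d) - K(k) - c_0 \;\ge\; \beta + \tfrac{c_3}{2} - c_0 - c_1 - c_2 \;>\; \beta
\quad \text{as soon as } c_3 > 2(c_0 + c_1 + c_2).
\]
```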

After $d$ is chosen, we need to estimate the complexity of marked elements in $L$. They can be
enumerated given $P$, $k$, $d$, and there are at most $O(2^k d)$ of them, so for every marked
$x \in L$ we have

$$K(x|P,k,d) \le k + \log d + O(1)$$

and

$$K(x) \le K(P) + K(k,d) + k + \log d + O(1).$$

Recalling that $K(P) \le \alpha$ and $d = 2(\beta + K(k)) + O(1)$, we get

$$\begin{aligned}
K(x) &\le \alpha + K(k, K(k), \beta) + k + \log \beta + \log K(k) + O(1) \\
     &\le \alpha + K(k, K(k)) + K(\beta) + k + \log \beta + \log K(k) + O(1);
\end{aligned}$$

it remains to note that $K(k, K(k)) = K(k) + O(1)$ and that $K(\beta) = O(\log \beta)$. □

4 Information distance 

Consider the following problem. Let $m$ be some constant. Given a string $x_0$ and an integer
$n$, we want to find strings $x_1, \ldots, x_m$ such that $C(x_i|x_j) = n + O(1)$ for all pairs
of different $i, j$ in the range $0, \ldots, m$. (Note that both $i$ and $j$ can be equal to 0.)
This is possible only if $x_0$ has high enough complexity, at least $n$, since $C(x_0|x_j)$ is
bounded by $C(x_0)$. It turns out that such $x_1, \ldots, x_m$ indeed exist if $C(x_0)$ is high
enough (though the required complexity of $x_0$ is greater than $n$), and the constant hidden in
the $O(1)$-notation does not depend on $n$ (but depends on $m$).

This statement is non-trivial even for $m = 1$: it says that for every $n$ and for every string
$x$ of high enough complexity there exists a string $y$ such that both $C(x|y)$ and $C(y|x)$ are
equal to $n + O(1)$. This special case was considered in [16]; the condition there is
$C(x_0) > 2n$ (which
is better than provided by our general result). Later flLTI a different technique (using some 
topological arguments) was used to improve this result and show that $C(x_0) > n + O(\log n)$ is
enough.

Here is the exact statement that specifies also the dependence of the $O(1)$-constant on $m$:

Theorem 5. For every $m$ and $n$ and for every binary string $x_0$ such that

$$C(x_0) > n(m^2 + m + 1) + O(\log m)$$

there exist strings $x_1, \ldots, x_m$ such that

$$n \le C(x_i|x_j) \le n + O(\log m)$$

for every two different $i, j \in \{0, \ldots, m\}$.

Note that the high precision is what makes this theorem non-trivial (if an additional term
$O(\log C(x_0))$ were allowed, one could take the shortest program for $x_0$ and replace the
first $n$ bits in it by $m$ independent random strings).

Proof. Let us explain the game that corresponds to this statement. It is played on a graph with
$(m+1)$ parts $X_0, \ldots, X_m$. There are countably many vertices in each part $X_i$
(representing possible values of $x_i$); we will assume that all $X_i$ are disjoint copies of
the set $\mathbb{B}^*$ of all binary strings. As usual, there are two players: Alice and Bob.
Alice may connect vertices from different parts by undirected edges, while Bob can connect them
by directed edges. Alice and Bob make alternating moves; at each move they can add any finite
set of edges. Alice can also mark vertices $x_0$ in $X_0$. The restrictions are:

• Alice may mark at most $m 2^{n+1+nm(m+1)}$ vertices (in $X_0$);

• for each vertex $x_i \in X_i$ and for each $j \ne i$, Alice may have at most $m(m+1)2^n$
undirected edges connecting $x_i$ with vertices in $X_j$;

• for each vertex $x_i \in X_i$ and for each $j \ne i$, Bob should have less than $2^n$
outgoing edges from $x_i$ to vertices in $X_j$. (Note that the number of incoming edges is not
bounded.)

The game is infinite. Alice wins if (in the limit) for every non-marked vertex $x_0 \in X_0$
there exist vertices $x_1, \ldots, x_m$ from $X_1, \ldots, X_m$ such that every two vertices
$x_i, x_j$ (where $i \ne j$) are connected by an undirected (Alice's) edge, but not connected by
a directed (Bob's) edge.
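On a finite snapshot of the game, the winning condition for a single vertex of the first part can be checked by brute force. The sketch below is illustrative only: the function name `find_clique` and the representation of vertices as `(part, string)` tuples, of Alice's edges as `frozenset` pairs and of Bob's edges as ordered pairs are assumptions made here. The actual game is infinite, so this checks the condition only for the edges drawn so far.

```python
from itertools import product


def find_clique(x0, parts, alice_edges, bob_edges):
    """Look for x_1, ..., x_m that witness the winning condition for x0.

    parts: list [X_0, ..., X_m] of finite lists of vertices (a finite snapshot).
    alice_edges: set of frozenset({u, v}) undirected edges.
    bob_edges: set of (u, v) directed edges.
    Returns the tuple (x0, x_1, ..., x_m) or None if no witness exists.
    """
    for rest in product(*parts[1:]):
        clique = (x0,) + rest
        ok = all(
            frozenset((u, v)) in alice_edges      # Alice's edge is present
            and (u, v) not in bob_edges           # no Bob's edge in either direction
            and (v, u) not in bob_edges
            for i, u in enumerate(clique)
            for v in clique[i + 1:]
        )
        if ok:
            return clique
    return None
```

For example, with $m = 2$, if Alice has drawn all edges of two triangles sharing two vertices and Bob has spoiled one of them with a single directed edge, the search returns the other triangle.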

Lemma 6. Alice has a computable winning strategy in this game. 

It is easy to see how this lemma can be used to prove the statement. Imagine that Bob draws an
edge $x_i \to x_j$ when he discovers that $C(x_j|x_i) < n$. Then he never violates the
restriction. Alice can computably win against this strategy; every marked vertex then has small
complexity, since a marked vertex can be described by its ordinal number in the enumeration
order. This ordinal number requires

$$\log\bigl(m 2^{n+1+nm(m+1)}\bigr) = \log m + O(1) + n + m^2 n + nm$$

bits, and to describe the game we need an additional $O(\log n) + O(\log m)$ bits to specify $m$
and $n$, so we get

$$C(x_0) \le n(1 + m + m^2) + O(\log m) + O(\log n).$$

We want to conclude that $x_0$ is not marked (since it has high complexity), but the bound we
have is slightly weaker than needed: it has an additional term $O(\log n)$. To get rid of this
term, we note that (for given $m$) the bounds for the number of marked vertices grow
exponentially with $n$, so we can describe all marked vertices (for given $m$ and for all $n$)
simultaneously, and the overhead in the complexity caused by marked vertices for smaller values
of $n$ is bounded by $O(1)$.
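The length of the ordinal number can be checked by direct arithmetic. The helper names below are hypothetical; the formula for the limit on marked vertices is the one from the first restriction of the game.

```python
import math


def mark_limit(m, n):
    """Maximal number of vertices Alice may mark: m * 2^(n + 1 + n*m*(m+1))."""
    return m * 2 ** (n + 1 + n * m * (m + 1))


def ordinal_bits(m, n):
    """Bits needed for the ordinal number of a marked vertex (before rounding)."""
    return math.log2(mark_limit(m, n))


# The exponent n + 1 + n*m*(m+1) equals n*(1 + m + m^2) + 1.
print(mark_limit(1, 3), ordinal_bits(2, 2))
```

For $m = 1$ the limit is $2^{3n+1}$, and in general the number of bits is $\log m + 1 + n(1 + m + m^2)$, which is the source of the term $n(m^2 + m + 1) + O(\log m)$ in Theorem 5.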

For every non-marked vertex $x_0$ there exist $x_1, \ldots, x_m$ that satisfy the winning
conditions. For them $C(x_j|x_i) \ge n$ (otherwise Bob would connect them by a directed edge),
and $C(x_j|x_i) \le n + O(\log m)$, since $x_j$ can be obtained from $x_i$ if we know $i$, $j$,
and the ordinal number of the undirected edge $x_i$-$x_j$ among all the edges that connect $x_i$
to $X_j$, in the order of appearance of those edges in the game.

So it remains to prove the lemma. To make the idea of the proof clear, let us first consider the
case $m = 1$. In this case we deal with two countable sets $X_0$ and $X_1$, Alice's degree is
bounded by $2^{n+1}$ and the total number of marked vertices should not exceed $2^{3n+1}$. To
explain Alice's strategy, let us tell a story first.

Imagine a "marriage agency" whose business is to form pairs $(x_0, x_1)$ of elements
$x_0 \in X_0$ and $x_1 \in X_1$. After a pair is formed (or at some later moment), each of the
"partners" (elements of the pair) may "complain" about the other one. Then the pair is dissolved
and both elements become free. Later the agency can try them with new partners.

The mission of the agency is to provide stable pairs for everybody or almost everybody. Of
course, this is not always possible: imagine that some element complains about all partners.
Moreover, even if we additionally require that each element makes less than $M$ complaints, it
may happen that for some $x$ all its partners complain about $x$ (still making less than $M$
complaints each), and the agency cannot do much for $x$.

However, by clever planning the agency can control the damage and ensure that

• the agency makes at most $2M$ attempts to find a partner for any given element (never trying
the same partnership twice);

• all elements of $X_0$, except for at most $2M^3$ "hopeless" ones, ultimately get a stable
partnership, and hopeless elements are explicitly marked.

(Note that the last requirement treats $X_0$ and $X_1$ in a non-symmetrical way.)

The agency can achieve its goals using the following strategy. First it chooses an arbitrary
bijection between $X_0$ and $X_1$ and creates all corresponding pairs. Then it treats complaints
one by one: if some $x_0$ complains about its current partner $x_1$ or vice versa, the pair
$(x_0, x_1)$ is dissolved. Then the agency tries to find a new partner for $x_0$ among elements
of $X_1$ with matching experience.

The last requirement is the crucial point of our argument: it means that in the new pair the
number of complaints made by one partner should be equal to the number of complaints received by
the other one. In this way an unlucky element who was rejected $M - 1$ times will get a partner
who made $M - 1$ complaints and therefore is unable to complain again. So nobody will be
rejected $M$ or more times.

The bad news is that sometimes for an element $x_0$ from a dissolved pair there is no partner
with matching experience; in this case $x_0$ is declared "hopeless" and never considered again.
We should estimate the maximal number of hopeless elements. We can encode "experience" as a pair
of two integers in the range $[0, M)$, so there are at most $M^2$ possible values of this
parameter, and hopeless elements can be divided into $M^2$ classes. Let us show that in each
class there are at most $2M$ elements. Since elements in $X_0$ and $X_1$ change their experience
simultaneously (when a complaint is made), and newly formed pairs are made of matching elements,
free elements in $X_1$ also form $M^2$ classes of the same cardinalities. If there are already
$2M$ hopeless elements in some class, there are also $2M$ matching free elements. A new hopeless
element in this class cannot appear since one of these matching free elements can be used to
form a new pair. (Recall that each element can send less than $M$ complaints and receive less
than $M$ complaints, so one of the $2M$ free elements of matching experience was not tried yet.)
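The agency's strategy can be tried out on finite sets. The simulation below is a sketch under explicit assumptions: both sides are finite and of the same size (instead of countable), complaints are generated at random among those that are still allowed, and the name `run_agency` and its data layout are hypothetical. It is meant only to illustrate the bookkeeping, not to replace the argument.

```python
import random
from collections import Counter, defaultdict


def run_agency(size, M, steps, seed=0):
    """Simulate the agency on X0 = X1 = {0, ..., size-1}; each element may make
    fewer than M complaints. Returns (hopeless, hopeless per class, tried partners)."""
    rng = random.Random(seed)
    # exp[side][v] = [complaints made, complaints received]
    exp = [[[0, 0] for _ in range(size)] for _ in range(2)]
    partner = list(range(size))           # current X1-partner of each x0 (None if hopeless)
    tried = [{v} for v in range(size)]    # X1-elements already tried with each x0
    free1 = defaultdict(set)              # free X1-elements keyed by (made, received)
    hopeless = set()
    for _ in range(steps):
        # Collect all complaints that are still allowed: (x0, complaining side).
        options = []
        for x0, x1 in enumerate(partner):
            if x1 is None:
                continue
            if exp[0][x0][0] < M - 1:
                options.append((x0, 0))
            if exp[1][x1][0] < M - 1:
                options.append((x0, 1))
        if not options:
            break
        x0, side = rng.choice(options)
        x1 = partner[x0]
        # Both partners update their experience simultaneously.
        if side == 0:
            exp[0][x0][0] += 1
            exp[1][x1][1] += 1
        else:
            exp[1][x1][0] += 1
            exp[0][x0][1] += 1
        free1[tuple(exp[1][x1])].add(x1)  # the pair is dissolved, x1 becomes free
        # Matching experience: "made" and "received" are swapped between partners.
        made, received = exp[0][x0]
        candidates = free1[(received, made)] - tried[x0]
        if candidates:
            new = min(candidates)
            free1[(received, made)].remove(new)
            partner[x0] = new
            tried[x0].add(new)
        else:
            partner[x0] = None
            hopeless.add(x0)
    per_class = Counter(tuple(exp[0][x0]) for x0 in hopeless)
    return hopeless, per_class, tried
```

In such runs one can check that no class contains more than $2M$ hopeless elements, that the total number of hopeless elements is at most $2M^3$, and that no element of $X_0$ is tried with $2M$ or more partners.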

One last remark about the agency's strategy: we started with making infinitely many pairs (using
some bijection between $X_0$ and $X_1$) at once. This is not important, since the actual
implementation of this decision can be made gradually (we think about some pairs as existing,
but they are not yet informed about that).

Now we explain how this story can be transformed into Alice's strategy in the game described.
The parameter $M$ (the bound for the number of complaints) is $2^n$; then $2M$ equals $2^{n+1}$
and $2M^3$ equals $2^{3n+1}$, as the lemma requires for $m = 1$. When the agency makes a pair,
Alice draws an (undirected) edge between the elements of the pair. When the pair is dissolved,
the edge (of course) does not disappear, but Alice does not care about it any more, considering
only "active" edges (that correspond to currently existing pairs). When Bob draws a (directed)
edge $x \to y$ that is parallel to one of the active edges (the undirected edge $x$-$y$), the
agency sees that $x$ complains about $y$ (and, according to this complaint, dissolves the pair
$x$-$y$). When Bob draws an edge that is not parallel to an active edge, this edge is ignored
until a parallel active edge appears (the corresponding pair is established); then this old edge
becomes a complaint and the newly formed pair is dissolved. (If Bob draws an edge that is
parallel to an old inactive edge of Alice, this edge will never change anything.) Finally, the
agency's declaration that some $x_0 \in X_0$ is hopeless means that Alice marks $x_0$.

It is easy to see that the agency's behavior described above can be transformed into Alice's
strategy, so Alice indeed has a (computable) winning strategy for the case $m = 1$.

After these preparations let us consider the general case. The idea remains the same, but
instead of two sets $X_0$ and $X_1$ we now have $m+1$ components $X_0, X_1, \ldots, X_m$.
Instead of pairs, we now have cliques made of $m + 1$ elements, one per component. A participant
of a clique may complain about some other participant, and in this case the clique is dissolved
(and an attempt to create a new clique for the $X_0$-element of the dissolved one is performed;
again $X_0$ gets preferential treatment).

The clique is represented by Alice's edges between all its elements, $m(m+1)/2$ edges in total.
A directed Bob's edge $x_i \to x_j$ that connects two elements $x_i$ and $x_j$ of one of the
currently active cliques is understood as a "complaint" of $x_i$ against $x_j$. (Other edges
created by Bob are delayed complaints, as before.)

The important change is how the "experience" is defined. Each vertex remembers $m(m+1)$
non-negative integers corresponding to ordered pairs $(p, q)$ of different components. This
tuple $I = \{I_{p,q}\}$ (where $p, q \in \{0, 1, \ldots, m\}$ and $p \ne q$) is called an
"index" of a vertex. When $x_i$ complains about $x_j$ (both are elements of the same clique
$(x_0, \ldots, x_m)$), all participants of this clique note this and increase the
$I_{i,j}$-component of their index (initially filled with zeros) before the clique is dissolved.
Note the difference: now each element $x_i$ knows not only how many complaints it made
($I_{i,j}$ is the number of complaints about $X_j$-elements) or received ($I_{j,i}$ is the
number of complaints received from $X_j$-elements), but also the number of complaints between
other components (where $x_i$ is only a witness).
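The bookkeeping for indices is simple enough to write down directly. In the sketch below an index is a dictionary keyed by ordered pairs of distinct component numbers; this representation and the names `new_index` and `record_complaint` are assumptions made for illustration.

```python
from itertools import permutations


def new_index(m):
    """Index of a fresh vertex: one zero counter per ordered pair (p, q), p != q."""
    return {pq: 0 for pq in permutations(range(m + 1), 2)}


def record_complaint(clique_indices, i, j):
    """The X_i-element complains about the X_j-element of the same clique:
    every member of the clique (including witnesses) increments entry (i, j)."""
    for index in clique_indices:
        index[(i, j)] += 1


clique = [new_index(2) for _ in range(3)]   # a clique for m = 2
record_complaint(clique, 0, 2)              # x_0 complains about x_2
print(clique[1][(0, 2)])                    # the witness x_1 has also counted it
```

Since all members of a clique apply the same update, their indices stay equal, which is what allows the agency to recombine free elements with the same index later.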

After one element of a clique complains about another one, all elements of the clique update
their indices, and the clique is dissolved. To find the new clique for the element
$x_0 \in X_0$ from the dissolved clique, we search for free elements with the same index in all
the components. Moreover, it is required that these elements have never sent complaints about
each other (but it is OK if some of them were in the same clique, later dissolved because of
some other complaint). If this is possible, a new clique is formed; if not, $x_0$ becomes marked
("hopeless") and the other elements of the dissolved clique remain free (outside the cliques).

Since only elements with the same index are combined into cliques, and the indices are updated
synchronously, the number of free elements (that do not belong to an active clique) is the same
for all components (in general and for each value of the index). Note also that all the numbers
in the indices are less than $2^n$ (since each of them is a number of complaints sent by some
$x_i$ to some $x_j$). When an element changes the clique, its index increases along some
coordinate, so the number of changes is at most $m(m+1)2^n$, and each change creates $m$ new
edges adjacent to this element (one per component). So for every element $x_i \in X_i$ and for
each $j \ne i$ there are at most $m(m+1)2^n$ undirected edges that connect $x_i$ to vertices in
$X_j$.

To finish the proof of Lemma 6, it remains to prove the bound for the number of marked vertices
(= hopeless elements in $X_0$). For that we estimate the number of marked vertices of each index
(recall that the number of possible indices is bounded by $2^{nm(m+1)}$ since its components are
less than $2^n$). The idea here is simple: if we have many (at least $2m2^n$) free vertices of
some index, we can always find a clique (made of them) for every vertex $x_0 \in X_0$ of that
index that lost its old clique. Indeed, we find clique elements sequentially in
$X_1, \ldots, X_m$; at every step we can find a vertex that has no complaints about already
selected vertices and vice versa, since the number of complaints in both directions is less than
$2 \cdot 2^n$ for each of the components (less than $2^n$ for each direction), and in total less
than $2m2^n$ elements in the next component are unusable due to previous ones. So there are at
most $2m2^n$ marked vertices of each index, and at most
$2m2^n \cdot 2^{nm(m+1)} = m 2^{n+1+nm(m+1)}$ marked vertices in total, as required. □